Impact of social determinants of health on improving the LACE index for 30-day unplanned readmission prediction

Abstract

Objective: Early and accurate prediction of patients at risk of readmission is key to reducing costs and improving outcomes. LACE is a widely used score to predict 30-day readmissions. We examine whether adding social determinants of health (SDOH) to LACE can improve its predictive performance.

Methods: This is a retrospective study that included all inpatient encounters in the state of Maryland in 2019. We constructed predictive models by fitting Logistic Regression (LR) on LACE and different sets of SDOH predictors. We used the area under the curve (AUC) to evaluate discrimination and SHapley Additive exPlanations values to assess feature importance.

Results: Our study population included 316 558 patients, of whom 35 431 (11.19%) were readmitted within 30 days. Readmitted patients had more challenges with individual-level SDOH and were more likely to reside in communities with poor SDOH conditions. Adding a combination of individual and community-level SDOH improved LACE performance from AUC = 0.698 (95% CI [0.695–0.7]; ref) to AUC = 0.708 (95% CI [0.705–0.71]; P < .001). The increase in AUC was highest in black patients (+1.6), patients aged 65 years or older (+1.4), and male patients (+1.4).

Discussion: We demonstrated the value of SDOH in improving the LACE index. Further, the additional predictive value of SDOH on readmission risk varies by subpopulations. Vulnerable populations like black patients and the elderly are likely to benefit more from the inclusion of SDOH in readmission prediction.

Conclusion: These findings provide potential SDOH factors that health systems and policymakers can target to reduce overall readmissions.


INTRODUCTION
Unplanned 30-day readmissions are a significant financial burden on the US health care system. As reported by the Centers for Medicare & Medicaid Services (CMS), 2 million patients are readmitted annually in the United States, costing Medicare $26 billion. It is also estimated that $17 billion of that cost comes from potentially avoidable readmissions. 1 Reducing avoidable hospital readmissions has been a key focus of health policies and programs, such as the CMS Hospital Readmission Reduction Program, as the US health care system moves to value-based care. 2

The Centers for Disease Control and Prevention (CDC) defines SDOH as conditions in which people are born, grow, live, work, and age. 3 SDOH at the individual level can be related to a person's age, ethnic background, marital status, or other social factors. At the community level, SDOH are conditions in places where people live that affect their health and quality of life. The community social factors are often grouped into domains related to economic stability, education access, health care access, neighborhood and built environment, and social and community context. 3 A growing body of evidence indicates that both individual- and community-level SDOH factors impact a patient's readmission risk. 4,5 Consequently, incorporating SDOH into 30-day readmission prediction tools or interventions can potentially improve discharge planning and mitigate readmission rates. 5,6

Early and accurate prediction of patients at risk of readmission is key to reducing costs and improving health outcomes. An increasing number of predictive models have been developed to identify patients at high risk of 30-day readmissions. 7 With advances in predictive modeling techniques, new models tend to be complex and prioritize discrimination over interpretability. 7,8 The use of models that leverage simple algorithms can offer transparency in terms of feature interpretation, which is beneficial in clinical settings. 9 The LACE index is a simple and widely validated risk score that is used to predict 30-day readmissions. 10 Nonetheless, LACE uses exclusively clinical factors (length of stay, acuity, comorbidities, and emergency visits in the last 6 months) and does not account for known contributors to readmission risk such as SDOH.
Previous literature reported an increase in the performance of LACE when incorporating additional variables from electronic health records and other sources. 11,12 An international study showed that incorporating a combination of demographics, markers of hospitalization severity, past healthcare utilization, and SDOH increases performance of readmission prediction. 11 The study only included 2 SDOH variables, namely, the requirement of financial assistance and admission to a subsidized hospital ward. It is unclear how much these SDOH variables alone contributed to improving the LACE model. In another study, 12 the authors trained an artificial neural network using a large number of electronic health records and census tract SDOH variables. While the artificial neural network model outperformed LACE, no attempt was made to augment LACE itself. Another study on an urban safety-net population found that augmenting LACE with the Area Deprivation Index and individual-level SDOH such as homelessness, learning barriers, and language preferences, increased its performance. 13 While these studies were successful in improving LACE, they provided limited details on the impact of SDOH variables alone and the importance of these factors in the augmented models. Previous work also did not compare the impact of individual-level SDOH and community-level SDOH on LACE.
In this study, our first objective was to assess the impact and importance of SDOH variables in improving the prediction performance of LACE for unplanned 30-day readmission. We examined the value of including different sets of SDOH predictors, namely, individual-level SDOH, community-level SDOH, and a combination of both. Our second objective was to investigate whether the added predictive value of SDOH varies by demographic subpopulations. To assess this hypothesis, we compared the performance of the models on 8 different demographic subgroups stratified by age, sex, and race.

Study setting and population
This is a retrospective analysis of Maryland's Healthcare Cost and Utilization Project (HCUP) data. 14 We used the State Inpatient Database (SID) and the State Emergency Department Database (SEDD) from HCUP. Community-level SDOH variables were extracted from the County Health Ranking (CHR) database, 15 which includes variables related to health outcomes and social and economic factors for each county in the United States.
Our study population included all inpatient encounters in Maryland from January to December 2019. The initial denominator of the study population included 422 736 patients. We dropped 72 621 patients younger than 18 years, because the original LACE index was developed for the adult population. We also excluded patients with residential zip codes outside Maryland; patients with inconsistent race records; invalid or missing admission type or length of stay (LOS); and invalid or missing county codes. In addition, we dropped encounters with missing patient ID, age, or race. After applying the inclusion and exclusion criteria, 316 558 participants were retained for the final analysis (Figure 1).

Predictors and outcome
The primary outcome of interest is 30-day all-cause unplanned readmission. We defined unplanned readmissions as subsequent inpatient admissions that occur within 31 days of the discharge date of an inpatient visit and have an admission type of emergency or urgent. We used the all-cause readmission variable provided in HCUP and updated readmission status based on the admission type of subsequent admissions. The predictors for the study included the LACE index and its components, individual-level SDOH, and community-level SDOH.
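The outcome definition above can be sketched in a few lines. This is a minimal illustration, not the study's actual pipeline: the record layout (`patient_id`, `admit_date`, `discharge_date`, `admission_type`) and the function name are hypothetical, and HCUP files encode timing differently from plain calendar dates. The sketch also assumes that "within 31 days" means a gap of 0 to 31 days between discharge and the next admission.

```python
from datetime import date

# Admission types treated as unplanned, following the definition in the text.
UNPLANNED_TYPES = {"emergency", "urgent"}


def flag_unplanned_readmissions(encounters, window_days=31):
    """Return one boolean per encounter: True if the same patient has a
    later emergency/urgent admission within `window_days` of discharge.

    `encounters` is a list of dicts with the hypothetical keys
    patient_id, admit_date, discharge_date, admission_type.
    """
    by_patient = {}
    for idx, enc in enumerate(encounters):
        by_patient.setdefault(enc["patient_id"], []).append(idx)

    flags = [False] * len(encounters)
    for indices in by_patient.values():
        # Order each patient's stays chronologically by admission date.
        indices.sort(key=lambda i: encounters[i]["admit_date"])
        for pos, i in enumerate(indices):
            discharge = encounters[i]["discharge_date"]
            for j in indices[pos + 1:]:
                later = encounters[j]
                gap = (later["admit_date"] - discharge).days
                if gap > window_days:
                    break  # later stays are even further away
                if gap >= 0 and later["admission_type"] in UNPLANNED_TYPES:
                    flags[i] = True
                    break
    return flags


example = [
    {"patient_id": 1, "admit_date": date(2019, 1, 1),
     "discharge_date": date(2019, 1, 5), "admission_type": "elective"},
    {"patient_id": 1, "admit_date": date(2019, 1, 20),
     "discharge_date": date(2019, 1, 22), "admission_type": "emergency"},
]
example_flags = flag_unplanned_readmissions(example)
```

In this toy example, the first stay is flagged because an emergency admission follows 15 days after discharge; the second stay has no subsequent admission and is not flagged.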
The LACE index is a widely validated model that predicts patients at risk for readmission or death within 30 days of discharge. The LACE index incorporates 4 covariates: LOS, admission type, Charlson Comorbidity Index (CCI), and emergency department (ED) visits within the last 6 months. We calculated the LACE score using inpatient and ED data from the SID and SEDD databases. We used the LACE index weights from the original study 10 to compute a LACE score for each patient.
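For readers unfamiliar with the scoring, the sketch below shows how a LACE score can be assembled from its 4 components. The point values are the commonly cited weights of the original LACE derivation and should be verified against reference 10 before any use; the function name and argument names are illustrative assumptions, and whole-number inputs are assumed.

```python
def lace_score(los_days, acute_admission, charlson_index, ed_visits_6mo):
    """Compute a LACE score (range 0-19) from its four components.

    Weights follow the commonly cited LACE scoring scheme; check them
    against the original publication before relying on this sketch.
    All numeric inputs are assumed to be non-negative integers.
    """
    # L: length of stay in days
    if los_days < 1:
        l_points = 0
    elif los_days <= 3:
        l_points = los_days          # 1, 2, or 3 days map to 1, 2, or 3 points
    elif los_days <= 6:
        l_points = 4
    elif los_days <= 13:
        l_points = 5
    else:
        l_points = 7

    # A: acuity (emergency/urgent admission)
    a_points = 3 if acute_admission else 0

    # C: Charlson Comorbidity Index, capped at 5 points for an index of 4+
    c_points = charlson_index if charlson_index <= 3 else 5

    # E: ED visits in the previous 6 months, capped at 4
    e_points = min(ed_visits_6mo, 4)

    return l_points + a_points + c_points + e_points


# A 5-day emergency stay, CCI of 2, one prior ED visit: 4 + 3 + 2 + 1
example_score = lace_score(5, True, 2, 1)
```

Under these weights the example patient scores 10, which places them at the conventional high-risk cutoff discussed later in the paper.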
For individual-level SDOH, we used a list of International Classification of Diseases, Tenth Revision (ICD-10) codes related to social risk factors to identify different SDOH factors that may influence a patient's readmission status (Supplementary Table S4). The ICD-10 list was adapted from the compendium of SDOH codes compiled by the Social Interventions Research and Evaluation Network (SIREN). 16 These variables include access to healthcare, homelessness, housing, stress, utilities, social connections and isolation, incarceration, clothing, and marital status.
For community-level SDOH, we used county-level variables from the CHR database. The CHR variables cover different health factor domains. We selected a subset of measures based on completeness and a literature review of community factors that were previously studied for their association with readmission rates. 4 Community-level variables were highly correlated. We assessed multicollinearity using the Pearson correlation coefficient and dropped highly correlated variables (absolute r value larger than 0.7). We also assessed variance inflation factor as an indicator of multicollinearity and required variables used in the model to have a variance inflation factor less than 10.
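The correlation-based screening step can be illustrated as follows. This is a sketch under stated assumptions: the text does not say which member of a correlated pair was dropped, so the greedy "keep the first, drop later variables that correlate with it" rule is an assumption, as are the function names. The variance inflation factor check is not shown, and every column is assumed to be non-constant.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.7):
    """Greedy filter: keep a variable only if its absolute correlation
    with every already-kept variable is at most `threshold`.

    `columns` maps variable name -> list of county-level values; the
    dict's insertion order decides which variable of a pair survives
    (an assumption made for this sketch).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson_r(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: "b" is almost a multiple of "a" and is therefore dropped.
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
kept_variables = drop_correlated(toy)
```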

Model development and statistical analysis
We investigated the differences in demographics, individual SDOH, and community SDOH between patients with and without 30-day readmissions. We also compared readmissions by LOS and CCI. We used the chi-square test for categorical variables and t test for continuous variables to compare readmitted and nonreadmitted patients. We define statistical significance as P-value <.05.
We randomly split the dataset into 50% training and 50% testing sets. Given our large dataset, we performed an equal split to increase the size of our testing set. This can produce performance estimates that generalize better to unseen data. We constructed predictive models for the 30-day readmission outcome using Logistic Regression (LR) on different sets of predictors. First, we developed a base model using the LACE index components as predictors. Second, we added individual-level SDOH variables to the base model. Third, we developed a model using community-level SDOH variables and the LACE components. For community-level SDOH variables, we applied Lasso regression for feature selection on the training set and retained a subset of top predictors. Finally, we built a model using individual SDOH, the subset of community-level SDOH, and the LACE components.
For model building and selection, we performed 3-fold cross-validation (CV) on the training set. The data partitioning for CV was stratified to account for class imbalance. This means that each fold of the CV split had the same percentage of 30-day readmissions as the original dataset. During each iteration of the CV, we kept 1 partition for testing and used the 2 remaining folds to search for the optimal model. We used a grid search to tune LR across a range of hyperparameter settings (ie, L1 and L2 penalty, class weights, and the inverse of regularization strength). The final generalization error was estimated by averaging area under the curve (AUC) scores over the held-out folds. The best LR hyperparameters were used to refit the classifiers on the whole training data. We evaluated performance on the 50% test data set and computed AUC, sensitivity, specificity, positive predictive value, negative predictive value, and Brier score. We used the Youden index to identify the optimized prediction threshold to balance sensitivity and specificity. Receiver operating characteristic curves and confusion matrices were used to illustrate the performance of the models.

Finally, we used SHapley Additive exPlanations (SHAP) values for feature importance. SHAP is a game-theoretic approach to explain the predictions of machine learning models by computing the contribution of each feature to the predictions made by the model. 17

To account for within-county variations, we conducted a sensitivity analysis. We fit 2 mixed-effects logit models to the training data for both the community-level SDOH and the all-level SDOH models.
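The evaluation metrics named above are simple to state in code. The following pure-Python sketch shows AUC (as the probability that a randomly chosen readmitted patient receives a higher score than a randomly chosen nonreadmitted one), the Youden-index threshold (the cutoff maximizing sensitivity + specificity - 1), and the Brier score. The function names are illustrative, and the quadratic-time loops are chosen for readability, not for a dataset of the size used in the study.

```python
def auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(y_true, y_score):
    """Return the score cutoff that maximizes Youden's J = TPR - FPR,
    predicting 'readmitted' when score >= cutoff."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= cutoff and y == 0)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example with 3 readmitted (1) and 2 nonreadmitted (0) patients.
labels = [0, 0, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7]
toy_auc = auc(labels, scores)               # 5 of 6 pairs ranked correctly
toy_cutoff = youden_threshold(labels, scores)
```

In the toy example, the cutoff of 0.7 captures 2 of the 3 readmitted patients with no false positives, which gives the largest Youden's J among the candidate cutoffs.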

Population characteristics and readmission
Our study population included 316 558 patients, of whom 35 431 (11.19%) were readmitted within 30 days (Table 1). The mean age of the population was 55.9 (SD = 21.1) years, 61.6% were women, 43.2% were married, 54.7% were white, and 33.0% were black. The mean LOS was 4.49 (SD = 6.55) days, 36.7% had no comorbidities (CCI = 0), and 7.4% had a CCI of 5 or higher. The average LACE score was 7.17 (SD = 3.94) and 28.2% of patients were in the higher readmission LACE risk group (LACE ≥ 10).
Unadjusted comparisons in Table 1 show that patients with unplanned 30-day readmissions had a longer mean LOS of 7.00 days (SD = 9.33) compared to 4.18 days (SD = 6.04) for nonreadmitted patients. Among readmitted patients, 47.7% were male and 52.3% were female. Readmitted patients were older, with a mean age of 63.1 (SD = 18.4) years compared to 55.0 (SD = 21.2) years for nonreadmitted patients. Compared to patients without readmission, patients with unplanned readmissions were more likely to be white (58.9% vs 54.2%) and less likely to be married (38.5% vs 43.9%). A CCI of 5 or higher was more common among readmitted patients (20.4% vs 5.8%).
Overall, patients with unplanned 30-day readmissions had poorer individual-level SDOH conditions than patients without readmissions. A higher proportion of readmitted patients had challenges with accessing healthcare (0.6% vs 0.2%); challenges with food security (0.8% vs 0.2%); problems related to legal circumstances or incarceration (1.7% vs 0.6%); challenges related to physical or emotional safety (1.3% vs 0.5%); isolation and lack of social connections (3.5% vs 1.3%); and challenges with stress (2.4% vs 0.7%). All individual-level SDOH differences between the 2 groups were statistically significant. Supplementary Table S1 shows the prevalence of individual-level SDOH by demographic subgroups.

Table 2 presents unadjusted comparisons between readmitted and nonreadmitted patient groups. Patients with 30-day readmissions were more likely to reside in communities with poor SDOH. Readmitted patients came from counties with a higher proportion of smokers (14.2% vs 13.8%), higher firearm fatalities rate (13.6% vs

Model performance and interpretability
As presented in Supplementary Table S2, we found that building the predictive model on LACE components (ie, CCI, LOS, admission type, and the number of ED visits within the previous 6 months) performed better in our population (AUC = 0.698) than using LACE as a continuous score (AUC = 0.68) or using the recommended cutoff of 10 (AUC = 0.62). Hence, we considered the LACE components model as our baseline to conduct all comparisons.
We assessed the performance of the predictive models for 30-day readmission on the test set (Table 3). Receiver operating characteristic curves were generated to compare the different model performances (Figure 2). Confusion matrices for the models are shown in Supplementary Figure S1. The AUC for the LACE base model for the overall population was 0.698 (95% CI [0.695-0.7]; ref), with a Brier score of 0.22. Adding individual-level SDOH to LACE slightly increased the AUC to 0.702 (95% CI [0.698-0.704]; P < .001) and improved the accuracy (Brier score = 0.09). Adding community-level SDOH to the base model produced a small but significant improvement in performance, with an AUC of 0.705 (95% CI [0.702-0.707]; P < .001), but had no effect on the Brier score. The combined model with LACE components, individual-level SDOH, and community-level SDOH showed the largest improvement over the base model, with an AUC of 0.708 (95% CI [0.705-0.71]; P < .001) and an improved Brier score of 0.09.
The additional predictive effects of SDOH on LACE were assessed in different demographic subgroups (Table 4). Overall, we noticed the same trend of improvement we observed in the general population. Nevertheless, the improvement in discrimination and accuracy varied by cohort and by the type of SDOH variables added to the base models. Adding individual-level SDOH to LACE improved the Brier score in all cohorts but tended to have a minimal effect on discrimination. The community-level SDOH variables improved discrimination but not the accuracy or calibration. Incorporating both individual and community level SDOH in the LACE (Figures 3-5).

The x-axis in the SHAP summary plots denotes the impact of a feature on the model output, with each prediction represented by a dot. Higher SHAP values in the SHAP summary plot (right on the x-axis) indicate a higher readmission probability. The y-axis represents the predictors in descending order of importance. The gradient color indicates the original value for that variable: red or blue for categorical variables and a spectrum from blue to red for continuous variables.

The most important individual SDOH predictors associated with higher readmissions for the general population cohort were: stress, isolation or lack of social connections, problems with access to health care, being married, problems with physical or emotional safety, and problems related to legal circumstances or incarceration (Figure 3). The top community SDOH predictors associated with higher readmissions in the community model were: a higher rate of mental health practitioners, a higher percentage of the population who drives alone to work, and a higher percentage of the elderly population (Figure 4). The community SDOH predictors associated with lower readmissions were: a higher percentage of uninsured adults, a higher percentage of female population, and a higher percentage of alcohol-impaired deaths (Figure 4).

In the LACE plus all-level SDOH model, the top predictors remained the community SDOH predictors we listed previously, followed by stress from individual-level SDOH (Figure 5).
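To make the reading of the SHAP summary plots concrete, the sketch below computes SHAP values for a linear model such as logistic regression. It relies on the standard closed form for linear models with features treated as independent, where the contribution of feature j to the log-odds is coefficient_j × (x_j − mean_j). Which SHAP explainer the study used is not stated in the text, so this closed form, the function name, and the toy numbers are all assumptions made for illustration.

```python
def linear_shap_values(coefs, background, x):
    """SHAP values (in log-odds units) for one observation `x` of a
    linear/logistic model, assuming independent features.

    coefs      : list of model coefficients, one per feature
    background : list of rows used to estimate each feature's mean
    x          : the observation to explain
    """
    n_rows = len(background)
    means = [sum(row[j] for row in background) / n_rows
             for j in range(len(coefs))]
    # Positive value: the feature pushes this prediction toward readmission.
    return [b * (xj - m) for b, xj, m in zip(coefs, x, means)]


# Toy model with two features and hypothetical coefficients.
toy_coefs = [0.5, -1.0]
toy_background = [[0, 1], [2, 3]]   # feature means are 1 and 2
toy_patient = [3, 2]
toy_shap = linear_shap_values(toy_coefs, toy_background, toy_patient)
```

The values sum to the difference between the patient's log-odds and the average log-odds over the background data, which is the additivity property that lets a summary plot rank predictors by their contribution.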

DISCUSSION
Predicting potentially avoidable readmission has been a key focus of recent research and policy. Social challenges and community factors have been found to impact health outcomes and utilization, including readmission rates. 6,20 In this study, we assessed whether incorporating SDOH in the LACE index can improve its prediction performance for unplanned 30-day readmission. Our findings show that the performance of LACE on the general population (AUC = 0.698, ref) can be marginally improved by adding individual-level SDOH, community-level SDOH, or a combination of both (AUC = 0.708, P < .001). Additionally, we found that SDOH improved LACE performance in certain demographic subgroups more than others. Notably, subpopulations that benefited most from adding SDOH are black patients, patients 65 or older, and male patients.
Our results indicate a small but significant improvement in AUC for the general population when adding SDOH to LACE. While the increase of 1% in discrimination may appear negligible, it does translate to the correct classification of an extra 3166 patients in our population. For a calibrated model, this can lead to predicting an additional 354 readmitted patients (ie, 11.19% of 3166 patients). Assuming an average readmission cost of $15 200 per patient, 21 incorporating SDOH in common readmission models may lead to saving approximately $5.4 million annually in avoidable costs in the state of Maryland alone. 22

Our results also illustrate that the additional predictive effects of social determinants on 30-day readmission risk vary by subpopulations. The LACE models built for black, 65 or older, and male cohorts benefited most from adding SDOH. On the other hand, white patients, females, and patients aged 18-44 years benefited less. This could be the result of vulnerable groups such as black patients and the elderly being disproportionately affected by social conditions and being less likely to be able to compensate for these factors. Additionally, male patients are at a much higher risk of readmission compared to women. 5 Our findings are in line with previous work, which found that adding SDOH to LACE for readmission prediction in an urban safety-net population increased its performance by 2% (from an AUC of 0.65 to 0.67). 13 Similar work on assessing the impact of individual and community-level SDOH on the HOSPITAL score reported an improved prediction of 30-day readmission for vulnerable patient subgroups including 65 or older, Medicaid, and obese patients. 22

Our second objective was to identify the top contributing SDOH variables to the improvement of LACE. We conducted a SHAP values analysis to evaluate the impact of the features in the different SDOH-augmented LACE models for the general population. The top 3 individual-level SDOH features that contributed to the LACE plus individual-level SDOH model were variables related to challenges with stress, social connections and isolation, and access to healthcare. All 3 variables have been previously linked to 30-day readmissions as strong predictors. 23-25 The SHAP summary plot of the community-SDOH LACE model indicates that patients had a higher probability of nonreadmission at 30 days if they lived in communities with a higher percentage of uninsured adults, a higher percentage of females, and a higher percentage of alcohol-impaired driving deaths. On the other hand, patients had a higher probability of readmission if they resided in communities with a higher rate of mental health practitioners, a higher percentage of 65 or older population, and a higher percentage of long commute-drive to work. In fact, previous work has identified the community's age characteristics, the percentage of workers who have a long commute-drive to work, and the percentage of alcohol-impaired driving deaths among the 19 most important community variables that predict readmission rates. 4

In the combined model, the LACE admission type variable was not part of the top 10 predictors. It is possible that access to ED is correlated with community factors. The ranking of the top community-SDOH features in the combined model was similar to that in the community-level model. The community-SDOH features also had a larger impact on the prediction compared to individual-level SDOH. Only stress from individual-level SDOH was among the top 10 most impactful features. This is possibly due to the lack of collection of individual SDOH data at the point of care. 26,27 It is possible that the collection of more comprehensive individual-level SDOH data can help improve LACE further.
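The savings estimate given earlier in the Discussion can be reproduced step by step. The sketch below follows the paper's own simplification, which treats the 0.01 gain in AUC as the share of patients additionally classified correctly; the function and variable names are illustrative, and intermediate counts are rounded to whole patients as in the text.

```python
def estimated_savings(population, auc_gain, readmission_rate, cost_per_readmission):
    """Reproduce the back-of-the-envelope estimate from the Discussion.

    Returns (extra patients correctly classified,
             extra readmitted patients identified,
             estimated avoidable cost in dollars).
    """
    extra_classified = round(population * auc_gain)
    extra_readmitted = round(extra_classified * readmission_rate)
    return (extra_classified,
            extra_readmitted,
            extra_readmitted * cost_per_readmission)


# Figures taken from the text: 316 558 patients, AUC 0.698 -> 0.708,
# an 11.19% readmission rate, and $15 200 per readmission.
classified, readmitted, dollars = estimated_savings(316_558, 0.01, 0.1119, 15_200)
```

With these inputs the function returns 3166 patients, 354 readmissions, and $5 380 800, which rounds to the approximately $5.4 million cited in the text.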
This study has some limitations. First, this study used data from the state of Maryland only. It is likely that we did not capture readmissions of patients who were readmitted to hospitals in neighboring states, thus leading to an underestimate of the true readmission rate. Second, individual-level SDOH data is underreported in hospital discharge data and is likely to be missing for a large proportion of patients. Hence, the top individual-level SDOH predictors may be different if SDOH were to be captured for all patients. Third, a county is a large geographical area and may not capture more granular SDOH information at the neighborhood level, which may be more relevant to readmission. Lastly, the hierarchical nature of the county and individual-level variables may limit the interpretation of the top features for the combined modeling. To address this limitation, we fit a mixed-effects logit model to account for group-level variations (ie, county), which showed similar trends of model improvement when adding SDOH (Supplementary Table S3). Future work may focus on investigating the interaction between the different levels of SDOH and incorporating SDOH from other sources at the census tract level or at the neighborhood level. Further research may also investigate other subpopulations including Medicaid patients and conditions with the highest readmission rates (eg, congestive heart failure).

CONCLUSION
In this study, we demonstrated the value of SDOH in improving the LACE index, a widely used tool to predict the risk of 30-day readmission. We also showed that the additional predictive effects of SDOH on 30-day readmission risk vary by subpopulations. Vulnerable populations like black patients and patients 65 or older are likely to benefit more from the inclusion of SDOH in readmission prediction. We also conducted an examination of the top predictors which can be investigated further in future studies. These findings provide potential SDOH challenges that health systems and policymakers can address to reduce overall hospital readmissions. Future work may examine the collection of SDOH during hospital admission to inform readmission prediction models at discharge.